Hash-based systems and methods for detecting, preventing, and tracing network worms and viruses

ABSTRACT

A system (126-129) detects transmission of potentially malicious packets. The system (126-129) receives packets and generates hash values corresponding to each of the packets. The system (126-129) may then compare the generated hash values to hash values corresponding to prior packets. The system (126-129) determines that one of the packets is a potentially malicious packet when the generated hash value corresponding to the one packet matches one of the hash values corresponding to one of the prior packets and the one prior packet was received within a predetermined amount of time of the one packet. The system (126-129) may also facilitate the tracing of the path taken by a potentially malicious packet. In this case, the system (126-129) may receive a message that identifies a potentially malicious packet, generate hash values from the potentially malicious packet, and determine whether one or more of the generated hash values match hash values corresponding to previously-received packets. The system (126-129) may then identify the potentially malicious packet as one of the previously-received packets when one or more of the generated hash values match the hash value corresponding to the one previously-received packet.

RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 10/654,771, filed Sep. 4, 2003, which, in turn, claims priority under 35 U.S.C. §119 based on U.S. Provisional Application No. 60/407,975, filed Sep. 5, 2002, both of which are incorporated herein by reference. U.S. patent application Ser. No. 10/654,771 is also a continuation-in-part of U.S. patent application Ser. No. 10/251,403, filed Sep. 20, 2002, which claims priority under 35 U.S.C. §119 based on U.S. Provisional Application No. 60/341,462, filed Dec. 14, 2001, both of which are incorporated herein by reference. U.S. patent application Ser. No. 10/654,771 is also a continuation-in-part of U.S. patent application Ser. No. 09/881,145, and U.S. patent application Ser. No. 09/881,074, both of which were filed on Jun. 14, 2001, and both of which claim priority under 35 U.S.C. §119 based on U.S. Provisional Application No. 60/212,425, filed Jun. 19, 2000, all of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to network security and, more particularly, to systems and methods for detecting and/or preventing the transmission of malicious packets, such as worms and viruses, and tracing their paths through a network.

2. Description of Related Art

Availability of low cost computers, high speed networking products, and readily available network connections has helped fuel the proliferation of the Internet. This proliferation has caused the Internet to become an essential tool for both the business community and private individuals. Dependence on the Internet arises, in part, because the Internet makes it possible for multitudes of users to access vast amounts of information and perform remote transactions expeditiously and efficiently. Along with the rapid growth of the Internet have come problems caused by malicious individuals or pranksters launching attacks from within the network. As the size of the Internet continues to grow, so does the threat posed by these individuals.

The ever-increasing number of computers, routers, and connections making up the Internet increases the number of vulnerability points from which these malicious individuals can launch attacks. These attacks can be focused on the Internet as a whole or on specific devices, such as hosts or computers, connected to the network. In fact, each router, switch, or computer connected to the Internet may be a potential entry point from which a malicious individual can launch an attack while remaining largely undetected. Attacks carried out on the Internet often consist of malicious packets being injected into the network. Malicious packets can be injected directly into the network by a computer, or a device attached to the network, such as a router or switch, can be compromised and configured to place malicious packets onto the network.

One particularly troublesome type of attack is a self-replicating network-transferred computer program, such as a virus or worm, that is designed to annoy network users, deny network service by overloading the network, or damage target computers (e.g., by deleting files). A virus is a program that infects a computer or device by attaching itself to another program and propagating itself when that program is executed, possibly destroying files or wiping out memory devices. A worm, on the other hand, is a program that can make copies of itself and spread itself through connected systems, using up resources in affected computers or causing other damage.

In recent years, viruses and worms have caused major network performance degradations and wasted millions of man-hours in clean-up operations in corporations and homes all over the world. Famous examples include the “Melissa” e-mail virus and the “Code Red” worm.

Various defenses, such as e-mail filters, anti-virus programs, and firewall mechanisms, have been employed against viruses and worms, but with limited success. The defenses often rely on computer-based recognition of known viruses and worms or block a specific instance of a propagation mechanism (e.g., block e-mail transfers of Visual Basic Script (.vbs) attachments). New viruses and worms have appeared, however, that evade existing defenses.

Accordingly, there is a need for new defenses to thwart the attack of known and yet-to-be-developed viruses and worms. There is also a need to trace the path taken by a virus or worm.

SUMMARY OF THE INVENTION

Systems and methods consistent with the present invention address these and other needs by providing a new defense that attacks malicious packets, such as viruses and worms, at their most common denominator (i.e., the need to transfer a copy of their code over a network to multiple target systems, where this code is generally the same for each copy, even though the rest of the message containing the virus or worm may vary). The systems and methods also provide the ability to trace the path of propagation back to the point of origin of the malicious packet (i.e., the place at which it was initially injected into the network).

In accordance with the principles of the invention as embodied and broadly described herein, a system detects the transmission of potentially malicious packets. The system receives packets and generates hash values corresponding to each of the packets. The system may then compare the generated hash values to hash values corresponding to prior packets. The system may determine that one of the packets is a potentially malicious packet when the generated hash value corresponding to the one packet matches one of the hash values corresponding to one of the prior packets and the one prior packet was received within a predetermined amount of time of the one packet.

According to another implementation consistent with the present invention, a system for hampering transmission of a potentially malicious packet is disclosed. The system includes means for receiving a packet; means for generating one or more hash values from the packet; means for comparing the generated one or more hash values to hash values corresponding to prior packets; means for determining that the packet is a potentially malicious packet when the generated one or more hash values match one or more of the hash values corresponding to at least one of the prior packets and the at least one of the prior packets was received within a predetermined amount of time of the packet; and means for hampering transmission of the packet when the packet is determined to be a potentially malicious packet.

According to yet another implementation consistent with the present invention, a method for detecting a path taken by a potentially malicious packet is disclosed. The method includes storing hash values corresponding to received packets; receiving a message identifying a potentially malicious packet; generating hash values from the potentially malicious packet; comparing the generated hash values to the stored hash values; and determining that the potentially malicious packet was one of the received packets when one or more of the generated hash values match the stored hash values.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate the invention and, together with the description, explain the invention. In the drawings,

FIG. 1 is a diagram of a system in which systems and methods consistent with the present invention may be implemented;

FIG. 2 is an exemplary diagram of a security server of FIG. 1 according to an implementation consistent with the principles of the invention;

FIG. 3 is an exemplary diagram of packet detection logic according to an implementation consistent with the principles of the invention;

FIGS. 4A and 4B illustrate two possible data structures stored within the hash memory of FIG. 3 in implementations consistent with the principles of the invention;

FIG. 5 is a flowchart of exemplary processing for detecting and/or preventing transmission of a malicious packet, such as a virus or worm, according to an implementation consistent with the principles of the invention;

FIG. 6 is a flowchart of exemplary processing for identifying the path taken through a network by a malicious packet, such as a virus or worm, according to an implementation consistent with the principles of the invention; and

FIG. 7 is a flowchart of exemplary processing for determining whether a malicious packet, such as a virus or worm, has been observed according to an implementation consistent with the principles of the invention.

DETAILED DESCRIPTION

The following detailed description of the invention refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements. Also, the following detailed description does not limit the invention. Instead, the scope of the invention is defined by the appended claims and equivalents.

Systems and methods consistent with the present invention provide mechanisms to detect and/or prevent the transmission of malicious packets and trace the propagation of the malicious packets through a network. Malicious packets, as used herein, may include viruses, worms, and other types of data with duplicated content, such as illegal mass e-mail (e.g., spam), that are repeatedly transmitted through a network.

According to implementations consistent with the present invention, the content of a packet may be hashed to trace the packet through a network. In other implementations, the header of a packet may be hashed. In yet other implementations, some combination of the content and the header of a packet may be hashed.

Exemplary System Configuration

FIG. 1 is a diagram of an exemplary system 100 in which systems and methods consistent with the present invention may be implemented. System 100 includes autonomous systems (ASs) 110-140 connected to public network (PN) 150. Connections made in system 100 may be via wired, wireless, and/or optical communication paths. While FIG. 1 shows four autonomous systems connected to a single public network, there can be more or fewer systems and networks in other implementations consistent with the principles of the invention.

Public network 150 may include a collection of network devices, such as routers (R1-R5) or switches, that transfer data between autonomous systems, such as autonomous systems 110-140. In an implementation consistent with the present invention, public network 150 takes the form of the Internet, an intranet, a public telephone network, a wide area network (WAN), or the like.

An autonomous system is a network domain in which all network devices (e.g., routers) in the domain can exchange routing tables. Often, an autonomous system can take the form of a local area network (LAN), a WAN, a metropolitan area network (MAN), etc. An autonomous system may include computers or other types of communication devices (referred to as “hosts”) that connect to public network 150 via an intruder detection system (IDS), a firewall, one or more border routers, or a combination of these devices.

Autonomous system 110, for example, includes hosts (H) 111-113 connected in a LAN configuration. Hosts 111-113 connect to public network 150 via an intruder detection system 114. Intruder detection system 114 may include a commercially-available device that uses rule-based algorithms to determine if a given pattern of network traffic is abnormal. The general premise used by an intruder detection system is that malicious network traffic will have a different pattern from normal, or legitimate, network traffic.

Using a rule set, intruder detection system 114 monitors inbound traffic to autonomous system 110. When a suspicious pattern or event is detected, intruder detection system 114 may take remedial action, or it can instruct a border router or firewall to modify operation to address the malicious traffic pattern. For example, remedial actions may include disabling the link carrying the malicious traffic, discarding packets coming from a particular source address, or discarding packets addressed to a particular destination.

Autonomous system 120 contains different devices from autonomous system 110. These devices aid autonomous system 120 in identifying and/or preventing the transmission of potentially malicious packets within autonomous system 120 and tracing the propagation of the potentially malicious packets through autonomous system 120 and, possibly, public network 150. While FIG. 1 shows only autonomous system 120 as containing these devices, other autonomous systems, including autonomous system 110, may include them.

Autonomous system 120 includes hosts (H) 121-123, intruder detection system 124, and security server (SS) 125 connected to public network 150 via a collection of devices, such as security routers (SR11-SR14) 126-129. Hosts 121-123 may include computers or other types of communication devices connected, for example, in a LAN configuration. Intruder detection system 124 may be configured similar to intruder detection system 114.

Security server 125 may include a device, such as a general-purpose computer or a server, that performs source path identification when a malicious packet is detected by intruder detection system 124 or a security router 126-129. While security server 125 and intruder detection system 124 are shown as separate devices in FIG. 1, they can be combined into a single unit performing both intrusion detection and source path identification in other implementations consistent with the present invention.

FIG. 2 is an exemplary diagram of security server 125 according to an implementation consistent with the principles of the invention. While one possible configuration of security server 125 is illustrated in FIG. 2, other configurations are possible.

Security server 125 may include a processor 202, main memory 204, read only memory (ROM) 206, storage device 208, bus 210, display 212, keyboard 214, cursor control 216, and communication interface 218. Processor 202 may include any type of conventional processing device that interprets and executes instructions.

Main memory 204 may include a random access memory (RAM) or a similar type of dynamic storage device. Main memory 204 may store information and instructions to be executed by processor 202. Main memory 204 may also be used for storing temporary variables or other intermediate information during execution of instructions by processor 202. ROM 206 may store static information and instructions for use by processor 202. It will be appreciated that ROM 206 may be replaced with some other type of static storage device. Storage device 208, also referred to as a data storage device, may include any type of magnetic or optical media and their corresponding interfaces and operational hardware. Storage device 208 may store information and instructions for use by processor 202.

Bus 210 may include a set of hardware lines (conductors, optical fibers, or the like) that allow for data transfer among the components of security server 125. Display device 212 may be a cathode ray tube (CRT), liquid crystal display (LCD) or the like, for displaying information in an operator or machine-readable form. Keyboard 214 and cursor control 216 may allow the operator to interact with security server 125. Cursor control 216 may include, for example, a mouse. In an alternative configuration, keyboard 214 and cursor control 216 can be replaced with a microphone and voice recognition mechanisms to enable an operator or machine to interact with security server 125.

Communication interface 218 enables security server 125 to communicate with other devices/systems via any communications medium. For example, communication interface 218 may include a modem, an Ethernet interface to a LAN, an interface to the Internet, a printer interface, etc. Alternatively, communication interface 218 can include any other type of interface that enables communication between security server 125 and other devices, systems, or networks. Communication interface 218 can be used in lieu of keyboard 214 and cursor control 216 to facilitate operator or machine remote control and communication with security server 125.

As will be described in detail below, security server 125 may perform source path identification and/or prevention measures for a malicious packet that entered autonomous system 120. Security server 125 may perform these functions in response to processor 202 executing sequences of instructions contained in, for example, memory 204. Such instructions may be read into memory 204 from another computer-readable medium, such as storage device 208, or from another device coupled to bus 210 or coupled via communication interface 218.

Alternatively, hardwired circuitry may be used in place of or in combination with software instructions to implement the functions of security server 125. For example, the functionality may be implemented in an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like, either alone or in combination with other devices.

Returning to FIG. 1, security routers 126-129 may include network devices, such as routers, that may detect and/or prevent the transmission of malicious packets and perform source path identification functions. Security routers 127-129 may include border routers for autonomous system 120 because these routers include connections to public network 150. As a result, security routers 127-129 may include routing tables for routers outside autonomous system 120.

FIG. 3 is an exemplary diagram of packet detection logic 300 according to an implementation consistent with the principles of the invention. Packet detection logic 300 may be implemented within a device that taps one or more bidirectional links of a router, such as security routers 126-129. In another implementation, packet detection logic 300 may be implemented within a router, such as security routers 126-129. In the discussion that follows, it may be assumed that packet detection logic 300 is implemented within a security router.

Packet detection logic 300 may include hash processor 310 and hash memory 320. Hash processor 310 may include a conventional processor, an ASIC, an FPGA, or a combination of these that generates one or more representations of each received packet and records the packet representations in hash memory 320.

A packet representation will likely not be a copy of the entire packet, but rather it will include a portion of the packet or some unique value representative of the packet. Because modern routers can pass gigabits of data per second, storing complete packets is not practical because memories would have to be prohibitively large. By contrast, storing a value representative of the contents of a packet uses memory in a much more efficient manner. By way of example, if incoming packets range in size from 256 bits to 1000 bits, a fixed width number may be computed across fixed-sized blocks making up the content (or payload) of a packet in a manner that allows the entire packet to be identified. To further illustrate the use of representations, a 32-bit hash value, or digest, may be computed across fixed-sized blocks of each packet. Then, the hash value may be stored in hash memory 320 or may be used as an index, or address, into hash memory 320. Using the hash value, or an index derived therefrom, results in efficient use of hash memory 320 while still allowing the content of each packet passing through packet detection logic 300 to be identified.

Systems and methods consistent with the present invention may use any storage scheme that records information about each packet in a space-efficient fashion, that can definitively determine if a packet has not been observed, and that can respond positively (i.e., in a predictable way) when a packet has been observed. Although systems and methods consistent with the present invention can use virtually any technique for deriving representations of packets, for brevity, the remaining discussion will use hash values as exemplary representations of packets having passed through a participating router.

Hash processor 310 may determine a hash value over successive, fixed-sized blocks in the payload field (i.e., the contents) of an observed packet. For example, hash processor 310 may hash each successive 64-byte block following the header field. As described in more detail below, hash processor 310 may use the results of the hash operation to recognize duplicate occurrences of packet content and raise a warning if it detects packets with replicated content within a short period of time. Hash processor 310 may also use the hash results for tracing the path of a malicious packet through the network.

The hash value may be determined by taking an input block of data, such as a 64-byte block of a packet, and processing it to obtain a numerical value that represents the given input data. Suitable hash functions are readily known in the art and will not be discussed in detail herein. Examples of hash functions include the Cyclic Redundancy Check (CRC) and Message Digest 5 (MD5).
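
By way of illustration only, the block-wise hashing described above might be sketched as follows. The sketch assumes a 32-bit CRC (via Python's standard zlib module) as the hash function and a 64-byte block size; the function name block_hashes and the choice to skip a short trailing block are assumptions made for this example and are not prescribed by the description.

```python
import zlib

BLOCK_SIZE = 64  # bytes; mirrors the 64-byte example block in the text


def block_hashes(payload: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return one 32-bit CRC for each successive full block of the payload."""
    hashes = []
    # Assumption: only complete blocks are hashed; a short tail is ignored.
    for offset in range(0, len(payload) - block_size + 1, block_size):
        block = payload[offset:offset + block_size]
        # Mask to 32 bits so the result is an unsigned fixed-width value.
        hashes.append(zlib.crc32(block) & 0xFFFFFFFF)
    return hashes
```

Two packets that carry the same 64-byte block at the same block offset yield the same hash value for that block, even when the remainder of each packet differs.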

The resulting hash value, also referred to as a message digest or hash digest, is a fixed length value. The hash value serves as a signature for the data over which it was computed. For example, incoming packets could have fixed hash value(s) computed over their content.

The hash value essentially acts as a fingerprint identifying the input block of data over which it was computed. Unlike fingerprints, however, there is a chance that two very different pieces of data will hash to the same value, resulting in a hash collision. An acceptable hash function should provide a good distribution of values over a variety of data inputs in order to prevent these collisions. Because collisions occur when different input blocks result in the same hash value, an ambiguity may arise when attempting to associate a result with a particular input.

Hash processor 310 may store a representation of each packet it observes in hash memory 320. Hash processor 310 may store the actual hash values as the packet representations or it may use other techniques for minimizing storage requirements associated with retaining hash values and other information associated therewith. A technique for minimizing storage requirements may use a bit array or Bloom filters for storing hash values.

Rather than storing the actual hash value, which can typically be on the order of 32 bits or more in length, hash processor 310 may use the hash value as an index for addressing a bit array within hash memory 320. In other words, when hash processor 310 generates a hash value for a fixed-sized block of a packet, the hash value serves as the address location into the bit array. At the address corresponding to the hash value, one or more bits may be set at the respective location thus indicating that a particular hash value, and hence a particular data packet content, has been seen by hash processor 310. For example, using a 32-bit hash value provides on the order of 4.3 billion possible index values into the bit array. Storing one bit per fixed-sized block rather than storing the block itself, which can be 512 bits long, produces a compression factor of 1:512. While bit arrays are described by way of example, it will be obvious to those skilled in the relevant art that other storage techniques may be employed without departing from the spirit of the invention.
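
The bit-array addressing described above might be sketched, for illustration only, as shown below. The class name HashBitArray and the index_bits parameter are assumptions of this example. A full 32-bit index would require 2^32 bits (512 MiB), so the sketch truncates the hash value to fewer index bits, which trades memory for a higher collision rate.

```python
class HashBitArray:
    """One bit per possible (truncated) hash value."""

    def __init__(self, index_bits: int = 20):
        # index_bits is an illustrative assumption; it must be at least 3.
        self.index_bits = index_bits
        self.bits = bytearray((1 << index_bits) // 8)

    def _locate(self, hash_value: int) -> tuple[int, int]:
        # Keep only the low index_bits of the hash as the bit address.
        index = hash_value & ((1 << self.index_bits) - 1)
        return index >> 3, 1 << (index & 7)

    def mark(self, hash_value: int) -> None:
        """Set the bit addressed by the hash value."""
        byte, mask = self._locate(hash_value)
        self.bits[byte] |= mask

    def seen(self, hash_value: int) -> bool:
        """Report whether the bit addressed by the hash value is set."""
        byte, mask = self._locate(hash_value)
        return bool(self.bits[byte] & mask)

    def flush(self) -> None:
        """Clear every bit, e.g. on a periodic flushing schedule."""
        self.bits = bytearray(len(self.bits))
```

A cleared bit means the block was definitely not recorded since the last flush; a set bit means the block, or another block that collides with it, was recorded.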

Over time, hash memory 320 may fill up and the possibility of overwriting an existing index value increases. The risk of overwriting an index value may be reduced if the bit array is periodically flushed to other storage media, such as a magnetic disk drive, optical media, solid state drive, or the like. Alternatively, the bit array may be slowly and incrementally erased. To facilitate this, a time-table may be established for flushing the bit array. If desired, the flushing cycle can be reduced by computing hash values only for a subset of the packets passing through the router. While this approach reduces the flushing cycle, it increases the possibility that a target packet may be missed (i.e., a hash value is not computed over a portion of it).

FIGS. 4A and 4B illustrate two possible data structures that may be stored within hash memory 320 in implementations consistent with the principles of the invention. As shown in FIG. 4A, hash memory 320 may include indicator fields 412 and counter fields 414 addressable by corresponding hash addresses 416. Hash addresses 416 may correspond to possible hash values generated by hash processor 310.

Indicator field 412 may store one or more bits that indicate whether a packet block with the corresponding hash value has been observed by hash processor 310. Counter field 414 may record the number of occurrences of packet blocks with the corresponding hash value. Counter field 414 may periodically decrement its count for flushing purposes.
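
For illustration only, the indicator and counter fields of FIG. 4A might be modeled as follows. The names HashEntry, HashMemory, record, and age are assumptions of this sketch, as is the use of a dictionary keyed by hash address in place of a fixed hardware memory.

```python
from dataclasses import dataclass


@dataclass
class HashEntry:
    indicator: bool = False  # whether a block with this hash was observed
    counter: int = 0         # number of occurrences recorded so far


class HashMemory:
    def __init__(self):
        # Maps a hash address to its indicator/counter entry.
        self.entries: dict[int, HashEntry] = {}

    def record(self, hash_address: int) -> int:
        """Mark the address as observed and return the updated count."""
        entry = self.entries.setdefault(hash_address, HashEntry())
        entry.indicator = True
        entry.counter += 1
        return entry.counter

    def age(self) -> None:
        """Periodic decrement; entries that reach zero are flushed."""
        for address in list(self.entries):
            entry = self.entries[address]
            entry.counter -= 1
            if entry.counter <= 0:
                del self.entries[address]
```

Calling age() on a fixed schedule lets infrequently seen hash values drain out of memory while frequently repeated ones persist.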

As shown in FIG. 4B, hash memory 320 may store additional information relating to a packet. For example, hash memory 320 may include link identifier (ID) fields 422 and status fields 424. Link ID field 422 may store information regarding the particular link upon which the packet arrived at packet detection logic 300. Status field 424 may store information to aid in monitoring the status of packet detection logic 300 or the link identified by link ID field 422.

In an alternate implementation consistent with the principles of the invention, hash memory 320 may be preprogrammed to store hash values corresponding to known malicious packets, such as known viruses and worms. Hash memory 320 may store these hash values separately from the hash values of observed packets. In this case, hash processor 310 may compare a hash value for a received packet to not only the hash values of previously observed packets, but also to hash values of known malicious packets.

In yet another implementation consistent with the principles of the invention, hash memory 320 may be preprogrammed to store source addresses of known sources of legitimate duplicated content, such as packets from a multicast server, a popular page on a web server, an output from a mailing list “exploder” server, or the like. In this case, hash processor 310 may compare the source address for a received packet to the source addresses of known sources of legitimate duplicated content.

Exemplary Processing for Malicious Packet Detection

FIG. 5 is a flowchart of exemplary processing for detecting and/or preventing transmission of a malicious packet, such as a virus or worm, according to an implementation consistent with the principles of the invention. The processing of FIG. 5 may be performed by packet detection logic 300 within a tap device, a security router, such as security router 126, or other devices configured to detect and/or prevent transmission of malicious packets. In other implementations, one or more of the described acts may be performed by other systems or devices within system 100.

Processing may begin when packet detection logic 300 receives, or otherwise observes, a packet (act 505). Hash processor 310 may generate one or more hash values by hashing successive, fixed-sized blocks from the packet's payload field (act 510). Hash processor 310 may use a conventional technique to perform the hashing operation.

Hash processor 310 may optionally compare the generated hash value(s) to hash values of known viruses and/or worms within hash memory 320 (act 515). In this case, hash memory 320 may be preprogrammed to store hash values corresponding to known viruses and/or worms. If one or more of the generated hash values match one of the hash values of known viruses and/or worms, hash processor 310 may take remedial actions (acts 520 and 525). The remedial actions may include raising a warning for a human operator, delaying transmission of the packet, requiring human examination before transmission of the packet, dropping the packet and possibly other packets originating from the same Internet Protocol (IP) address as the packet, sending a Transmission Control Protocol (TCP) close message to the sender thereby preventing complete transmission of the packet, disconnecting the link on which the packet was received, and/or corrupting the packet content in a way likely to render any code contained therein inert (and likely to cause the receiver to drop the packet).

If the generated hash value(s) do not match any of the hash values of known viruses and/or worms, or if such a comparison was not performed, hash processor 310 may optionally determine whether the packet's source address indicates that the packet was sent from a legitimate source of duplicated packet content (i.e., a legitimate “replicator”) (act 530). For example, hash processor 310 may maintain a list of legitimate replicators in hash memory 320 and check the source address of the packet with the addresses of legitimate replicators on the list. If the packet's source address matches the address of one of the legitimate replicators, then hash processor 310 may end processing of the packet. For example, processing may return to act 505 and await receipt of the next packet.

Otherwise, hash processor 310 may determine whether any prior packets with the same hash value(s) have been received (act 535). For example, hash processor 310 may use each of the generated hash value(s) as an address into hash memory 320. Hash processor 310 may then examine indicator field 412 (FIG. 4A) at each address to determine whether the one or more bits stored therein indicate that a prior packet has been received.

If there were no prior packets received with the same hash value(s), then hash processor 310 may record the generated hash value(s) in hash memory 320 (act 540). For example, hash processor 310 may set the one or more bits stored in indicator field 412, corresponding to each of the generated hash values, to indicate that the corresponding packet was observed by hash processor 310. Processing may then return to act 505 to await receipt of the next packet.

If hash processor 310 determines that a prior packet has been observed with the same hash value, hash processor 310 may determine whether the packet is potentially malicious (act 545). Hash processor 310 may use a set of rules to determine whether to identify a packet as potentially malicious. For example, the rules might specify that more than x (where x>1) packets with the same hash value have to be observed by hash processor 310 before the packets are identified as potentially malicious. The rules might also specify that these packets have to have been observed by hash processor 310 within a specified period of time of one another. The reason for the latter rule is that, in the case of malicious packets, such as viruses and worms, multiple packets will likely pass through packet detection logic 300 within a short period of time.
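
One possible reading of the two rules above (more than x packets with the same hash value, all within a specified period of time) is sketched below for illustration only. The class name RepetitionRule, the sliding-window interpretation of "within a specified period of time of one another," and the default values are assumptions of this example.

```python
from collections import defaultdict, deque


class RepetitionRule:
    def __init__(self, x: int = 3, window_seconds: float = 60.0):
        self.x = x                    # threshold; the text requires x > 1
        self.window = window_seconds  # assumed sliding time window
        # Per-hash timestamps of recent sightings, oldest first.
        self.sightings = defaultdict(deque)

    def observe(self, hash_value: int, now: float) -> bool:
        """Record a sighting; return True if the hash is now suspicious."""
        times = self.sightings[hash_value]
        times.append(now)
        # Discard sightings that fall outside the time window.
        while times and now - times[0] > self.window:
            times.popleft()
        # "More than x" packets with the same hash inside the window.
        return len(times) > self.x
```

With x=2 and a 10-second window, the third sighting of a hash value inside the window returns True, while a sighting long after the earlier ones starts the count over.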

A packet may contain multiple hash blocks that partially match hash blocks associated with prior packets. For example, a packet that includes multiple hash blocks may have somewhere between one and all of its hashed content blocks match hash blocks associated with prior packets. The rules might specify the number of blocks and/or the number and/or length of sequences of blocks that need to match before hash processor 310 identifies the packet as potentially malicious.
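
A block-level rule of this kind might be expressed as below. The function names, the thresholds `min_blocks` and `min_run`, and the decision to combine the two criteria with "or" are assumptions chosen for the example; the description leaves the exact rule open.

```python
from typing import List, Tuple


def block_match_stats(matches: List[bool]) -> Tuple[int, int]:
    """Return (number of matching blocks, length of the longest matching run).

    `matches` holds one flag per hash block, in packet order, telling whether
    that block's hash matched a block from a prior packet.
    """
    total = sum(matches)
    longest = run = 0
    for matched in matches:
        run = run + 1 if matched else 0
        longest = max(longest, run)
    return total, longest


def is_suspicious(matches: List[bool], min_blocks: int = 3, min_run: int = 2) -> bool:
    # Assumption: either enough matching blocks overall or a long enough
    # consecutive sequence is sufficient to flag the packet.
    total, longest = block_match_stats(matches)
    return total >= min_blocks or longest >= min_run
```

For a packet whose five blocks match as `[True, False, True, True, False]`, the sketch counts three matching blocks and a longest run of two.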

When hash processor 310 determines that the packet is not malicious (e.g., not a worm or virus), such as when fewer than x packets with the same hash value or fewer than a predetermined number of the packet blocks with the same hash values are observed or when the packets are observed outside the specified period of time, hash processor 310 may record the generated hash value(s) in hash memory 320 (act 540). For example, hash processor 310 may set the one or more bits stored in indicator field 412, corresponding to each of the generated hash values, to indicate that the corresponding packet was observed by hash processor 310. Processing may then return to act 505 to await receipt of the next packet.

When hash processor 310 determines that the packet may be malicious, then hash processor 310 may take remedial actions (act 550). In some cases, it may not be possible to determine whether the packet is actually malicious because there is some probability that there was a false match or a legitimate replication. As a result, hash processor 310 may determine the probability of the packet actually being malicious based on information gathered by hash processor 310.

The remedial actions may include raising a warning for a human operator, saving the packet for human analysis, dropping the packet, corrupting the packet content in a way likely to render any code contained therein inert (and likely to cause the receiver to drop the packet), delaying transmission of the packet, requiring human examination before transmission of the packet, dropping other packets originating from the same IP address as the packet, sending a TCP close message to the sender thereby preventing complete transmission of the packet, and/or disconnecting the link on which the packet was received. Some of the remedial actions, such as dropping or corrupting the packet, may be performed when the probability that the packet is malicious is above some threshold. This may greatly slow the spread rate of a virus or worm without completely stopping legitimate traffic that happened to match a suspect profile.
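
The threshold-gated choice of remedial action can be illustrated with a small policy function. The function name, the two threshold values, and the three action labels are hypothetical; the description states only that some actions may be reserved for probabilities above some threshold.

```python
def choose_action(probability: float,
                  drop_threshold: float = 0.9,
                  warn_threshold: float = 0.5) -> str:
    """Pick a remedial action from an estimated probability of maliciousness.

    Thresholds are illustrative assumptions, not values from the description.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= drop_threshold:
        return "drop"      # disruptive action reserved for high confidence
    if probability >= warn_threshold:
        return "warn"      # raise a warning but still forward the packet
    return "forward"       # treat as legitimate traffic
```

Keeping the disruptive action behind the higher threshold reflects the stated goal of slowing a worm without completely stopping legitimate traffic that happens to match.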

Exemplary Processing for Source Path Identification

FIG. 6 is a flowchart of exemplary processing for identifying the path taken through a network by a malicious packet, such as a virus or worm, according to an implementation consistent with the principles of the invention. The processing of FIG. 6 may be performed by a security server, such as security server 125, or other devices configured to trace the paths taken by malicious packets. In other implementations, one or more of the described acts may be performed by other systems or devices within system 100.

Processing may begin with intruder detection system 124 detecting a malicious packet. Intruder detection system 124 may use conventional techniques to detect the malicious packet. For example, intruder detection system 124 may use rule-based algorithms to identify a packet as part of an abnormal network traffic pattern. When a malicious packet is detected, intruder detection system 124 may notify security server 125 that a malicious packet has been detected within autonomous system 120. The notification may include the malicious packet or portions thereof along with other information useful for security server 125 to begin source path identification. Examples of information that intruder detection system 124 may send to security server 125 along with the malicious packet include time-of-arrival information, encapsulation information, link information, and the like.

After receiving the malicious packet, security server 125 may generate a query that includes the malicious packet and any additional information desirable for facilitating communication with participating routers, such as security routers 126-129 (acts 605 and 610). Examples of additional information that may be included in the query are, but are not limited to, destination addresses for participating routers, passwords required for querying a router, encryption keying information, time-to-live (TTL) fields, information for reconfiguring routers, and the like. Security server 125 may then send the query to security router(s) located one hop away (act 615). The security router(s) may analyze the query to determine whether they have seen the malicious packet. To make this determination, the security router(s) may use processing similar to that described below with regard to FIG. 7.

After processing the query, the security router(s) may send a response to security server 125. The response may indicate that the security router has seen the malicious packet, or alternatively, that it has not. It is important to observe that the two answers are not equal in their degree of certainty. If a security router does not have a hash matching the malicious packet, the security router has definitively not seen the malicious packet. If the security router has a matching hash, however, then the security router has seen the malicious packet or a packet that has the same hash value as the malicious packet. When two different packets, having different contents, hash to the same value, it is referred to as a hash collision.

The security router(s) may also forward the query to other routers or devices to which they are connected. For example, the security router(s) may forward the query to the security router(s) that are located two hops away from security server 125, which may forward the query to security router(s) located three hops away, and so on. This forwarding may continue to include routers or devices within public network 150 if these routers or devices have been configured to participate in the tracing of the paths taken by malicious packets. This approach may be called an inward-out approach because the query travels a path that extends outward from security server 125. Alternatively, an outward-in approach may be used.

Security server 125 receives the responses from the security routers indicating whether the security routers have seen the malicious packet (acts 620 and 625). If a response indicates that the security router has seen the malicious packet, security server 125 associates the response and identification (ID) information for the respective security router with active path data (act 630). Alternatively, if the response indicates that the security router has not seen the malicious packet, security server 125 associates the response and the ID information for the security router with inactive path data (act 635).

Security server 125 uses the active and inactive path data to build a trace of the potential paths taken by the malicious packet as it traveled, or propagated, across the network (act 640). Security server 125 may continue to build the trace until it receives all the responses from the security routers (acts 640 and 645). Security server 125 may attempt to build a trace with each received response to determine the ingress point for the malicious packet. The ingress point may identify where the malicious packet entered autonomous system 120, public network 150, or another autonomous system.
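
The following sketch shows one way a trace could be assembled from active and inactive path data. It assumes a known, loop-free router topology expressed as an adjacency map and treats the farthest active router on each branch as a candidate ingress point; the function name, the router labels, and this candidate heuristic are illustrative assumptions rather than the method of FIG. 6.

```python
from collections import deque
from typing import Dict, List, Tuple


def trace_paths(topology: Dict[str, List[str]],
                responses: Dict[str, bool],
                start: str) -> Tuple[List[str], List[str]]:
    """Walk outward from `start` through routers that answered positively.

    Returns (active routers in visit order, candidate ingress routers).
    Assumes the topology contains no loops.
    """
    active: List[str] = []
    candidates: List[str] = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        extended = False
        for neighbor in topology.get(node, []):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            if responses.get(neighbor, False):   # active path data
                active.append(neighbor)
                queue.append(neighbor)
                extended = True
            # A negative response (inactive path data) ends this branch.
        if node != start and not extended:
            candidates.append(node)  # no active router lies farther out
    return active, candidates
```

Because a positive response may stem from a hash collision, the candidates produced by such a walk are potential ingress points only.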

As security server 125 attempts to build a trace of the path taken by the malicious packet, several paths may emerge as a result of hash collisions occurring in the participating routers. When hash collisions occur, they act as false positives in the sense that security server 125 interprets the collision as an indication that the malicious packet has been observed. Fortunately, the occurrences of hash collisions can be mitigated. One mechanism for reducing hash collisions is to compute large hash values over the packets since the chances of collisions rise as the number of bits comprising the hash value decreases. Another mechanism to reduce false positives resulting from collisions is for each security router (e.g., security routers 126-129) to implement its own unique hash function. In this case, the same collision will not occur in other security routers.
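
The relationship between hash width and collision likelihood can be made concrete with a short calculation. The sketch assumes an idealized, uniformly distributed hash in which each recorded packet marks one of 2^b values; under that assumption, the chance that an unrelated packet matches at least one recorded value is 1 - (1 - 1/2^b)^n for n recorded packets. The function name is hypothetical.

```python
def collision_probability(hash_bits: int, recorded: int) -> float:
    """Chance that a new packet's hash equals at least one recorded hash.

    Assumes a uniform hash with 2**hash_bits possible values and `recorded`
    previously stored packets, each contributing one hash value.
    """
    slots = 2 ** hash_bits
    return 1.0 - (1.0 - 1.0 / slots) ** recorded


if __name__ == "__main__":
    for bits in (8, 16, 32):
        print(bits, collision_probability(bits, recorded=1000))
```

Running the loop shows the probability falling as the number of hash bits grows, which is the effect the paragraph above relies on.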

A further mechanism for reducing collisions is to control the density of the hash tables in the memories of participating routers. That is, rather than computing a single hash value and setting a single bit for an observed packet, a plurality of hash values may be computed for each observed packet using several unique hash functions. This produces a corresponding number of unique hash values for each observed packet. While this approach fills the hash table at a faster rate, the reduction in the number of hash collisions makes the tradeoff worthwhile in many instances. For example, Bloom Filters may be used to compute multiple hash values over a given packet in order to reduce the number of collisions and, hence, enhance the accuracy of traced paths.
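
A minimal Bloom filter along these lines is sketched below. The table size, the number of hash functions, and the trick of deriving several hash functions by prefixing an index byte to a SHA-256 input are assumptions made for the example, not details from the description.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Set-membership sketch: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, data: bytes) -> Iterator[int]:
        # Assumption: k hash functions obtained by salting SHA-256 with an index.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, data: bytes) -> None:
        # Record an observed packet by setting every one of its k positions.
        for pos in self._positions(data):
            self.bits[pos] = 1

    def might_contain(self, data: bytes) -> bool:
        # A packet counts as "seen" only if ALL k positions are set.
        return all(self.bits[pos] for pos in self._positions(data))
```

A packet added to the filter is always reported as possibly present, while an unrelated packet is reported as present only if all k of its positions happen to have been set by other packets.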

When security server 125 has determined an ingress point for the malicious packet, it may notify intruder detection system 124 that the ingress point for the malicious packet has been determined (act 650). Security server 125 may also take remedial actions (act 655). Often it will be desirable to have the participating router closest to the ingress point close off the ingress path used by the malicious packet. As such, security server 125 may send a message to the respective participating router instructing it to close off the ingress path using known techniques.

Security server 125 may also archive copies of solutions generated, data sent, data received, and the like either locally or remotely. Furthermore, security server 125 may communicate information about source path identification attempts to devices at remote locations coupled to a network. For example, security server 125 may communicate information to a network operations center, a redundant security server, or to a data analysis facility for post processing.

Exemplary Processing for Determining Whether a Malicious Packet has been Observed

FIG. 7 is a flowchart of exemplary processing for determining whether a malicious packet, such as a virus or worm, has been observed according to an implementation consistent with the principles of the invention. The processing of FIG. 7 may be performed by packet detection logic 300 implemented within a security router, such as security router 126, or by other devices configured to trace the paths taken by malicious packets. In other implementations, one or more of the described acts may be performed by other systems or devices within system 100.

Processing may begin when security router 126 receives a query from security server 125 (act 705). As described above, the query may include a TTL field. A TTL field may be employed because it provides an efficient mechanism for ensuring that a security router responds only to relevant, or timely, queries. In addition, employing TTL fields may reduce the amount of data traversing the network between security server 125 and participating routers because queries with expired TTL fields may be discarded.

If the query includes a TTL field, security router 126 may determine if the TTL field in the query has expired (act 710). If the TTL field has expired, security router 126 may discard the query (act 715). If the TTL field has not expired, security router 126 may hash the malicious packet contained within the query at each possible starting offset within a block (act 720). Security router 126 may generate multiple hash values because the code body of a virus or worm may appear at any arbitrary offset within the packet that carries it (e.g., each copy may have an e-mail header attached that differs in length for each copy).
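
Hashing at every starting offset (act 720) can be sketched as follows. The block size of 8 bytes, the use of SHA-256, and the function names are assumptions for illustration. The sketch hashes full fixed-size blocks only and repeats the blocking once for each starting offset within a block.

```python
import hashlib
from typing import Set


def block_hashes(payload: bytes, block_size: int = 8) -> Set[str]:
    """Hash successive full fixed-size blocks starting at offset 0."""
    return {
        hashlib.sha256(payload[start:start + block_size]).hexdigest()
        for start in range(0, len(payload) - block_size + 1, block_size)
    }


def hashes_at_all_offsets(payload: bytes, block_size: int = 8) -> Set[str]:
    """Repeat the blocking at each possible starting offset within a block."""
    values: Set[str] = set()
    for offset in range(block_size):
        values |= block_hashes(payload[offset:], block_size)
    return values


if __name__ == "__main__":
    body = b"ABCDEFGHIJKLMNOP"              # hypothetical worm code body
    recorded = block_hashes(b"hdr" + body)  # what a router stored earlier
    query = b"longer-header:" + body        # same body, different header length
    print(bool(recorded & block_hashes(query)))           # alignment differs
    print(bool(recorded & hashes_at_all_offsets(query)))  # offsets realign blocks
```

In the usage example the two copies carry headers of different lengths, so blocks taken at a single fixed alignment do not line up, whereas trying every offset recovers at least one block that both copies share.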

Security router 126 may then determine whether any of the generated hash values match one of the recorded hash values in hash memory 320 (act 725). Security router 126 may use each of the generated hash values as an address into hash memory 320. At each of the addresses, security router 126 may determine whether indicator field 412 indicates that a prior packet with the same hash value has been observed. If none of the generated hash values match a hash value in hash memory 320, security router 126 does not forward the query (act 730), but instead may send a negative response to security server 125 (act 735).

If one or more of the generated hash values match a hash value in hash memory 320, however, security router 126 may forward the query to all of its output ports excluding the output port in the direction from which the query was received (act 740). Security router 126 may also send a positive response to security server 125, indicating that the packet has been observed (act 745). The response may include the address of security router 126 and information about observed packets that have passed through security router 126.
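
The decision flow of acts 705 through 745 can be summarized in a short sketch. The `Query` structure, the modeling of the TTL field as an absolute expiry time, the port labels, and the dictionary returned by `handle_query` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Query:
    packet_hashes: Set[str]   # hash values generated from the queried packet
    expires_at: float         # assumption: TTL modeled as an expiry timestamp
    arrival_port: str         # port on which the query was received


def handle_query(query: Query, recorded: Set[str],
                 ports: List[str], now: float) -> Dict[str, object]:
    """Decide how a router responds to, and whether it forwards, a query."""
    response: Optional[str]
    if now > query.expires_at:
        # Acts 710 and 715: the TTL has expired, so discard the query silently.
        return {"response": None, "forward_to": []}
    if not (query.packet_hashes & recorded):
        # Acts 730 and 735: no match, so do not forward; answer negatively.
        return {"response": "negative", "forward_to": []}
    # Acts 740 and 745: forward on every port except the arrival port,
    # and answer positively.
    forward_to = [port for port in ports if port != query.arrival_port]
    response = "positive"
    return {"response": response, "forward_to": forward_to}
```

Stopping the forwarding at routers that report no match keeps the query from spreading along branches the packet did not traverse.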

Conclusion

Systems and methods consistent with the present invention provide mechanisms to detect and/or prevent transmission of malicious packets, such as viruses and worms, and trace the propagation of the malicious packets through a network.

The foregoing description of preferred embodiments of the present invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention.

For example, systems and methods have been described with regard to network-level devices. In other implementations, the systems and methods described herein may be used with a stand-alone device at the input or output of a network link or at other protocol levels, such as in mail relay hosts (e.g., Simple Mail Transfer Protocol (SMTP) servers).

While series of acts have been described with regard to the flowcharts of FIGS. 5-7, the order of the acts may differ in other implementations consistent with the principles of the invention. In addition, non-dependent acts may be performed concurrently.

Further, certain portions of the invention have been described as “logic” that performs one or more functions. This logic may include hardware, such as an application specific integrated circuit or a field programmable gate array, software, or a combination of hardware and software.

No element, act, or instruction used in the description of the present application should be construed as critical or essential to the invention unless explicitly described as such. Also, as used herein, the article “a” is intended to include one or more items. Where only one item is intended, the term “one” or similar language is used. The scope of the invention is defined by the claims and their equivalents.

1. A method for detecting transmission of malicious packets, comprising: receiving a plurality of packets; generating hash values corresponding to the packets; comparing the generated hash values to hash values corresponding to prior packets; and determining that one of the packets is a potentially malicious packet when the generated hash value corresponding to the one packet matches one of the hash values corresponding to one of the prior packets and the one prior packet was received within a predetermined amount of time of the one packet.
2. The method of claim 1, wherein the generating hash values includes: hashing a payload field in each of the packets to generate the hash values.
3. The method of claim 2, wherein the hashing a payload field includes: hashing successive fixed-sized blocks in the payload field in each of the packets.
4. The method of claim 1, further comprising: storing a plurality of hash values corresponding to known malicious packets.
5. The method of claim 4, further comprising: comparing the generated hash values to the hash values corresponding to the known malicious packets; and declaring that one of the packets is a malicious packet when one or more of the generated hash values corresponding to the one packet matches one or more of the hash values corresponding to the known malicious packets.
6. The method of claim 5, further comprising: taking remedial action when the one packet is declared a malicious packet.
7. The method of claim 6, wherein the taking remedial action includes at least one of: raising a warning, delaying transmission of the one packet, requiring human examination of the one packet, dropping the one packet, dropping other packets originating from a same address as the one packet, sending a Transmission Control Protocol (TCP) close message to a sender of the one packet, disconnecting a link on which the one packet was received, and corrupting the one packet.
8. The method of claim 1, further comprising: determining whether more than a predefined number of the prior packets with the matching hash value was received.
9. The method of claim 8, wherein the determining that one of the packets is a potentially malicious packet includes: identifying the one packet as a potentially malicious packet when more than the predefined number of the prior packets was received within the predetermined amount of time of the one packet.
10. The method of claim 8, further comprising: recording the generated hash value corresponding to the one packet when no more than the predefined number of the prior packets was received.
11. The method of claim 1, wherein the potentially malicious packet is associated with one of a virus and a worm.
12. The method of claim 1, further comprising: taking remedial action when the one packet is determined to be a potentially malicious packet.
13. The method of claim 12, wherein the taking remedial action includes at least one of: raising a warning, delaying transmission of the one packet, requiring human examination of the one packet, dropping the one packet, dropping other packets originating from a same address as the one packet, sending a Transmission Control Protocol (TCP) close message to a sender of the one packet, disconnecting a link on which the one packet was received, and corrupting the one packet.
14. The method of claim 12, wherein the taking remedial action includes at least one of: determining a probability value associated with whether the one packet is a potentially malicious packet, and performing a remedial action when the probability value is above a threshold.
15. The method of claim 1, further comprising: comparing a source address associated with the one packet to addresses of legitimate replicators, and determining that the one packet is not malicious when the source address matches one of the addresses of legitimate replicators.
16. A system for hampering transmission of a potentially malicious packet, comprising: means for receiving a packet; means for generating one or more hash values from the packet; means for comparing the generated one or more hash values to hash values corresponding to prior packets; means for determining that the packet is a potentially malicious packet when the generated one or more hash values match one or more of the hash values corresponding to at least one of the prior packets and the at least one of the prior packets was received within a predetermined amount of time of the packet; and means for hampering transmission of the packet when the packet is determined to be a potentially malicious packet.
17. A system for detecting transmission of potentially malicious packets, comprising: a plurality of input ports configured to receive a plurality of packets; a plurality of output ports configured to transmit the packets; a hash processor configured to: observe each of the packets received at the input ports, generate hash values corresponding to the packets, compare the generated hash values to hash values corresponding to previous packets, and determine that one of the packets is a potentially malicious packet when one or more of the generated hash values corresponding to the one packet matches one or more of the hash values corresponding to one of the previous packets and the one previous packet was received within a predetermined amount of time of the one packet.
18. The system of claim 17, wherein when generating hash values, the hash processor is configured to hash a payload field in each of the packets.
19. The system of claim 18, wherein when hashing the payload field, the hash processor is configured to hash successive fixed-sized blocks in the payload field in each of the packets.
20. The system of claim 17, further comprising: a hash memory configured to store a plurality of hash values corresponding to known malicious packets.